Ladder Tagger-Splitting Decision Space to Boost Tagging Quality
Authors
Abstract
This paper describes a part-of-speech tagger built on a set of probability mixture models. Each mixture model is responsible for tagging a specific class of words that share similar contextual properties. The probability mixture models contain 25 different mixture components. The tagger is tested on Polish and compared with other available taggers.
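The abstract does not say how the mixture components are combined or how words are assigned to classes, so the following is only a minimal sketch of the general idea, assuming a weighted sum of component distributions and a lookup from word class to model. All names (`MixtureModel`, `prev_tag_component`, `suffix_component`, `tag_word`), the two toy components, and their probabilities are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class MixtureModel:
    """Weighted mixture of component distributions over tags (illustrative only)."""

    def __init__(self, components, weights):
        if len(components) != len(weights):
            raise ValueError("one weight per component is required")
        total = sum(weights)
        self.components = components
        # Normalise the weights so that they sum to 1.
        self.weights = [w / total for w in weights]

    def tag_probs(self, context):
        """Return P(tag | context) as the weighted sum of the component outputs."""
        probs = defaultdict(float)
        for weight, component in zip(self.weights, self.components):
            for tag, p in component(context).items():
                probs[tag] += weight * p
        return dict(probs)

    def best_tag(self, context):
        probs = self.tag_probs(context)
        return max(probs, key=probs.get)


def prev_tag_component(context):
    # Hypothetical component: conditions only on the previous tag.
    if context["prev_tag"] == "prep":
        return {"noun": 0.9, "verb": 0.1}
    return {"noun": 0.5, "verb": 0.5}


def suffix_component(context):
    # Hypothetical component: conditions only on the word ending.
    if context["word"].endswith("ć"):
        return {"noun": 0.2, "verb": 0.8}
    return {"noun": 0.6, "verb": 0.4}


# One mixture model per word class; a single toy class is defined here.
models = {
    "noun_or_verb": MixtureModel([prev_tag_component, suffix_component], [1.0, 1.0]),
}


def tag_word(word_class, context):
    """Route the word to the model responsible for its class and pick a tag."""
    return models[word_class].best_tag(context)
```

For example, `tag_word("noun_or_verb", {"word": "pisać", "prev_tag": "adv"})` selects the tag whose mixed probability is highest under the two toy components. A real tagger of this kind would estimate the components and their weights from an annotated corpus rather than hard-coding them.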
Related resources
A Comparative Study of the Effect of Part-of-Speech Tagging on Parsing in Automatic Processing of the Persian Language
In this paper, the role of Part-of-Speech (POS) tagging for parsing in automatic processing of the Persian language is studied. To this end, the impact of the quality of POS tagging as well as the impact of the quantity of information available in the POS tags on parsing are studied. To reach the goals, three parsing scenarios are proposed and compared. In the first scenario, the parser assigns...
7 A Hybrid Grammatical Tagger :
In this chapter we discuss in detail how a piece of software can automatically carry out one important task in corpus annotation. The task is part-of-speech (POS) tagging (also called word-class tagging, or grammatical tagging); that is, assigning to each word in a text its correct part of speech in context. The result of this task, as a form of corpus annotation, was discussed in some detail ...
Estimation of Conditional Probabilities With Decision Trees and an Application to Fine-Grained POS Tagging
We present an HMM part-of-speech tagging method which is particularly suited for POS tagsets with a large number of fine-grained tags. It is based on three ideas: (1) splitting of the POS tags into attribute vectors and decomposition of the contextual POS probabilities of the HMM into a product of attribute probabilities, (2) estimation of the contextual probabilities with decision trees, and (3...
Probabilistic Part-of-Speech Tagging Using Decision Trees
In this paper, a new probabilistic tagging method is presented which avoids problems that Markov model-based taggers face when they have to estimate transition probabilities from sparse data. In this tagging method, transition probabilities are estimated using a decision tree. Based on this method, a part-of-speech tagger (called TreeTagger) has been implemented which achieves 96.36% accuracy...
Reductionistic, Tree and Rule Based Tagger for Polish
The paper presents an approach to tagging of Polish based on the combination of handmade reduction rules and selecting rules acquired by Induction of Decision Trees. The general open architecture of the tagger is presented, where the overall process of tagging is divided into subsequent steps and the overall problem is reduced to subproblems of ambiguity classes. A special language of constrain...